-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clarify allocation explain if random shard chosen #75670
Clarify allocation explain if random shard chosen #75670
Conversation
Today we often encounter users that are confused by the behaviour of calling `GET _cluster/allocation/explain` without a body: it _seems_ to work, but it explains a random shard, and if this isn't the shard they're thinking of then it's unclear how to proceed. With this commit we add a note to the response when a shard was randomly chosen indicating that it is possible, and possibly useful, to explain a different shard. We also adjust the exception message in the case when all shards are assigned to indicate why it's an invalid request and what to do to make it valid.
Pinging @elastic/es-distributed (Team:Distributed) |
Pinging @elastic/clients-team (Team:Clients) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The spec changes LGTM 👍
In my opinion we should always require a specific shard (unless there is only 1 unassigned shard) |
I know what you mean, but my experience has been that in many cases there are N (>1) unassigned shards, but they're all unassigned all for the same reason so explaining any of them is actually useful. There are other alternatives, we could try harder to make sure we're explaining a useful target (e.g. prefer primaries over replicas) or explain more of them in the diagnostics tooling, but simply making it an error not to specify a shard in this API would mean we're guaranteed to need another round-trip to the user which seems bad. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @DaveCTurner, this looks good to me. Apologies for the delay in reviewing. I left some minor comments, mostly for my own education. Thank you again for making the change.
@@ -25,6 +25,8 @@ GET _cluster/allocation/explain | |||
|
|||
`GET _cluster/allocation/explain` | |||
|
|||
`POST _cluster/allocation/explain` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I assume that POST should have been documented already but we had missed it, but please correct me if that is not the case. If that is the case, should we also be including examples for POST, similar to the ones we have for GET, or a note that they are equivalent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The convention is that all GET
APIs that take a body should also work the same with POST
, see e.g. https://www.elastic.co/guide/en/elasticsearch/reference/current/api-conventions.html#api-conventions
I've added this here to emphasise that you can send a request with a body - some tooling makes it hard to execute a GET
with a body, you have to switch it into POST
model
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, @DaveCTurner thanks for clarifying. Good to know there is general guidance in the documentation already.
@@ -53,6 +66,11 @@ public ClusterAllocationExplanation(ShardRouting shardRouting, @Nullable Discove | |||
} | |||
|
|||
public ClusterAllocationExplanation(StreamInput in) throws IOException { | |||
if (in.getVersion().onOrAfter(Version.V_8_0_0)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume BwC stands for backwards compatibility, is that correct? (I've seen it several times already, but I figured I should finally confirm).
I assume that adding an extra field in the API response is not a breaking change, is that correct?
I am also assuming we are being extra cautious and not introducing it in 7.x, but we could have technically done so, had we wanted to, is that correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct, BwC is short for backwards compatibility.
Adding an extra field to the response is indeed not considered a breaking change, we expect clients to skip fields they don't recognise.
The labels on the PR indicate how this will be backported, so yes this will go into 7.x. However we have to start out with the change being 8.0-only because we need it to pass all the mixed-version tests in CI first. We'll then backport it and (simultaneously) adjust the version on this line to match. It's quite a complex dance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, @DaveCTurner, thanks a lot for the context.
Today we often encounter users that are confused by the behaviour of calling `GET _cluster/allocation/explain` without a body: it _seems_ to work, but it explains a random shard, and if this isn't the shard they're thinking of then it's unclear how to proceed. With this commit we add a note to the response when a shard was randomly chosen indicating that it is possible, and possibly useful, to explain a different shard. We also adjust the exception message in the case when all shards are assigned to indicate why it's an invalid request and what to do to make it valid.
Today we often encounter users that are confused by the behaviour of
calling
GET _cluster/allocation/explain
without a body: it seems towork, but it explains a random shard, and if this isn't the shard
they're thinking of then it's unclear how to proceed.
With this commit we add a note to the response when a shard was randomly
chosen indicating that it is possible, and possibly useful, to explain a
different shard. We also adjust the exception message in the case when
all shards are assigned to indicate why it's an invalid request and what
to do to make it valid.